Before we begin analysing such data, we first need to create a mathematical framework to work in. Fortunately, we do not need anything too complicated, and for a finite time-series of length \(N\), we model the time series as a sequence of \(N\) random variables, \(X_i\), with \(i = 1, 2, ..., N\).
Note that each individual \(X_i\) is a wholly separate random variable: we only ever have a single measurement from each random variable. In many cases we simplify this, but it is important to understand and appreciate that such simplifications are just that. Time series are difficult to analyse.
A famous example of a time series is the count of airline passengers in the US, as shown in the figure below. This is a simple univariate time series, with measurements taken on a monthly basis over a number of years, each datum consisting of a single number - the number of passengers travelling via a commercial flight that month.

Before we get to any of that though, and before we try to build any kind of models for the data, we always start with visualising the data. Often, a simple plot of the data helps us pick out aspects to analyse and incorporate into the models. For time series, one of the first things to do is the time plot, a simple plot of the data over time.

For the passenger data, a few aspects stand out that are very common in time series. It is apparent that the numbers increase over time, and this systematic change in the data is called the trend. Often, approximating the trend as a linear function of time is adequate for many data sets.
plot(AirPassengers)

A repeating pattern in the data that occurs over the period of the data (in this case, each year), is called the seasonal variation, though a more general concept of ‘season’ is implied — it often will not coincide with the seasons of the calendar.
A slightly more generalised concept than seasonality is that of cycles, repeating patterns in the data that do not correspond to the natural fixed periods of the model. None of these are apparent in the air passenger data, and accounting for them is beyond the scope of this introductory tutorial.
Finally, another important benefit of visualising the data is that it helps identify possible outliers and erroneous data.
In many cases, we will also be dealing with time series that have multiple values at all, many, or some of the points in time.
Often, these values will be related in some ways, and we will want to analyse those relationships also. In fact, one of the most efficient methods of prediction is to find leading indicators for the value or values you wish to predict — you can often use the current values of the leading indicators to make inference on future values of the related quantities.
The fact that this is one of the best methods in time series analysis says a lot about the difficulty of prediction (Yogi Berra, a US baseball player noted for his pithy statements, once said “Prediction is difficult, especially about the future”).
In this workshop we will look at a number of different time series, discussed here.

These data come in a few different formats, and this workshop discusses methods for analysing them in a common format.
#### Worked Exercise 1.1

Load the air passengers data into your workspace and investigate the structure of the ts object using str(). How is a ts object different from a standard vector in R? Plot it using the default plot() method.

#### Worked Exercise 1.2

Using the data supplied in the file Maine.dat and the function read.table(), load the Maine unemployment data into your workspace and repeat the tasks above.

#### Worked Exercise 1.3

Analyse the trend and seasonality for the air passenger data using the aggregate() and cycle() functions. Create a boxplot for the data, segmenting the data by month.

#### Worked Exercise 1.4

Using the window() function, calculate quantitative values for the above.
Load the air passengers data into your workspace and investigate the structure of the object using str(). How is a ts object different from a standard vector in R? Plot it using the default plot() method.
class(AP.ts);
[1] "ts"

str(AP.ts);
 Time-Series [1:144] from 1949 to 1961: 112 118 132 129 121 135 148 148 136 119 ...

start(AP.ts); end(AP.ts); frequency(AP.ts);
[1] 1949 1
[1] 1960 12
[1] 12

plot(AP.ts, ylab = "Air Passengers (\'000s)");

### Using ggplot2 looks better, but you have to work hard for the
### labels on the x-axis so I am leaving this out for now.
qplot(1:length(AP.ts), as.vector(AP.ts), geom = 'line', ylab = 'Air Passengers (\'000s)');

Using the data supplied in the file Maine.dat and the function read.table(), load the Maine unemployment data into your workspace and repeat the tasks above.
class(MA.month.ts);
[1] "ts"

str(MA.month.ts);
 Time-Series [1:128] from 1996 to 2007: 6.7 6.7 6.4 5.9 5.2 4.8 4.8 4 4.2 4.4 ...

start(MA.month.ts); end(MA.month.ts); frequency(MA.month.ts);
[1] 1996 1
[1] 2006 8
[1] 12

plot(MA.month.ts, ylab = "Unemployment data for the state of Maine");

### Using ggplot2 looks better, but you have to work hard for the
### labels on the x-axis so I am leaving this out for now.
qplot(1:length(MA.month.ts), as.vector(MA.month.ts), geom = 'line', ylab = 'Unemployment data for the state of Maine');

Analyse the trend and seasonality for the air passenger data by using the aggregate() and cycle() functions. Create a boxplot for the data, segmenting the data by month.
### We are going to aggregate over the years, and extract the cycles
AP.year.ts <- aggregate(AP.ts);
AP.cycle.ts <- cycle(AP.ts);

### We are going to stack the two plots together
layout(1:2)
plot(AP.year.ts)
boxplot(AP.ts ~ AP.cycle.ts)

### Create a plot in ggplot2
#plot1 <- qplot(start(AP.year.ts)[1]:end(AP.year.ts)[1], as.vector(AP.year.ts), geom = 'line', xlab = 'Year', ylab = 'Yearly Aggregates')
#plot2 <- qplot(cycle, AP, data = data.frame(cycle = as.factor(AP.cycle.ts), AP = as.vector(AP.ts)), geom = 'boxplot', xlab = 'Month', ylab = 'Passengers');
#grid.arrange(plot1, plot2);

Repeat the above analysis for Maine unemployment data.
### We are going to aggregate over the years, and extract the cycles
MA.year.ts <- aggregate(MA.month.ts);
MA.cycle.ts <- cycle(MA.month.ts);

### We are going to stack the two plots together
layout(1:2)
plot(MA.year.ts)
boxplot(MA.month.ts ~ MA.cycle.ts)

### Create a plot in ggplot2
#plot1 <- qplot(start(MA.year.ts)[1]:end(MA.year.ts)[1], as.vector(MA.year.ts), geom = 'line', xlab = 'Year', ylab = 'Yearly Aggregates')
#plot2 <- qplot(cycle, MA, data = data.frame(cycle = as.factor(MA.cycle.ts), MA = as.vector(MA.month.ts)), geom = 'boxplot', xlab = 'Month', ylab = 'Unemployment (%)');
#grid.arrange(plot1, plot2);

Calculate the average monthly data for each of the above time series. Compare this to the actual monthly data and plot them together. What can we learn from this?
MA.year.ts <- aggregate(MA.month.ts);
MA.annual.mean.ts <- MA.year.ts / 12;

layout(1:2)
plot(MA.month.ts, ylab = "unemployed (%)")
plot(MA.annual.mean.ts, ylab = "unemployed (%)")

### We can also plot this in ggplot2
MA.month.vec <- as.vector(MA.month.ts);
MA.annual.mean.vec <- as.vector(MA.annual.mean.ts);

qplot(1:length(MA.month.vec), MA.month.vec, geom = 'line', colour = I('red')) +
    geom_line(aes(x = -6 + (1:length(MA.annual.mean.vec)) * 12, y = MA.annual.mean.vec), colour = 'blue');

Using the window() function, calculate quantitative values for the above.
MA.02.ts <- window(MA.month.ts, start = c(1996, 2), freq = TRUE);
MA.08.ts <- window(MA.month.ts, start = c(1996, 8), freq = TRUE);
head(MA.02.ts); tail(MA.02.ts);
[1] 6.7 6.5 5.7 5.0 4.4 4.2
[1] 4.2 4.9 5.8 5.6 5.8 5.6

str(MA.02.ts);
 Time-Series [1:11] from 1996 to 2006: 6.7 6.5 5.7 5 4.4 4.2 4.9 5.8 5.6 5.8 ...

Feb.ratio <- mean(MA.02.ts) / mean(MA.month.ts);
Feb.ratio
[1] 1.222529

Aug.ratio <- mean(MA.08.ts) / mean(MA.month.ts);
Aug.ratio
[1] 0.8163732
Load in the multivariate data from the file cbe.dat. Investigate the object type and some sample data to get an idea of how it is structured. The R functions head() and tail() will be of use for this.
Create time series objects for this data using ts(), and plot them beside each other. cbind() is useful for creating all the plots together.
Merge the electricity usage data with the US airline passenger data using ts.intersect() and investigate any possible similarities between the two time series.
Using the cor() function, investigate the correlation between the two time series. How plausible is a causal effect in this case?
head(CBE.df)
  choc  beer elec
1 1451 96.3 1497
2 2037 84.4 1463
3 2477 91.2 1648
4 2785 81.9 1595
5 2994 80.5 1777
6 2681 70.4 1824
tail(CBE.df)
  choc  beer  elec
391 7529 148.3 14002
392 8715 148.3 14338
393 8450 133.5 12867
394 9085 193.8 12761
395 8350 208.4 12449
396 7080 197.0 12658
str(CBE.df)
'data.frame': 396 obs. of  3 variables:
$ choc: int 1451 2037 2477 2785 2994 2681 3098 2708 2517 2445 ...
$ beer: num 96.3 84.4 91.2 81.9 80.5 70.4 74.8 75.9 86.3 98.7 ...
$ elec: int 1497 1463 1648 1595 1777 1824 1994 1835 1787 1699 ...
beer.ts <- ts(CBE.df$beer, start = 1958, freq = 12);
choc.ts <- ts(CBE.df$choc, start = 1958, freq = 12);
elec.ts <- ts(CBE.df$elec, start = 1958, freq = 12);

plot(cbind(beer.ts, choc.ts, elec.ts));

elec.ts <- ts(CBE.df$elec, start = 1958, freq = 12);
AP.elec.ts <- ts.intersect(AP.ts, elec.ts);

head(AP.elec.ts); tail(AP.elec.ts);
     AP.ts elec.ts
[1,] 340 1497
[2,] 318 1463
[3,] 362 1648
[4,] 348 1595
[5,] 363 1777
[6,] 435 1824
AP.ts elec.ts
[31,] 622 2287
[32,] 606 2276
[33,] 508 2096
[34,] 461 2055
[35,] 390 2004
[36,] 432 1924
str(AP.elec.ts);
 Time-Series [1:36, 1:2] from 1958 to 1961: 340 318 362 348 363 435 491 505 404 359 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "AP.ts" "elec.ts"
plot(AP.elec.ts);

qplot(Var1, value, data = melt(AP.elec.ts), geom = 'line', colour = Var2);

elec.ts <- ts(CBE.df$elec, start = 1958, freq = 12);
AP.elec.ts <- ts.intersect(AP.ts, elec.ts);

AP.elec.cor <- cor(AP.elec.ts);
str(AP.elec.cor);
 num [1:2, 1:2] 1 0.884 0.884 1
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "AP.ts" "elec.ts"
  ..$ : chr [1:2] "AP.ts" "elec.ts"

### Show the scaled plot
qplot(Var1, value, data = melt(scale(AP.elec.ts)), geom = 'line', colour = Var2);

Since many time series are dominated by trends or seasonal effects, we can create fairly simple models based on these two components. The first of these, the additive decomposition model, is just the sum of these effects, with the residual component being treated as random:
\[ x_t = m_t + s_t + z_t, \]
where, at any given time \(t\), \(x_t\) is the observed value, \(m_t\) is trend, \(s_t\) is the seasonal component, and \(z_t\) is the error term.
It is worth noting that, in general, the error terms will be a correlated sequence of values, something we will account for and model later.
In other cases, we could have a situation where the seasonal effect increases as the trend increases, and we model the values as:
\[ x_t = m_t s_t + z_t. \]
Other options exist, such as modelling the log of the observed values, though this causes some non-trivial modelling issues, such as biasing any predicted values for the time series.
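To make the additive model concrete, here is a minimal sketch (using simulated data rather than the workshop files; all parameter values are illustrative) that builds a series as trend plus season plus noise and then recovers the components with decompose():

```r
## Illustrative simulation: construct x_t = m_t + s_t + z_t, then
## recover the components with decompose(). All values here are made up.
set.seed(42)
n <- 120                                  # ten years of monthly data
t <- 1:n
m <- 0.5 * t                              # linear trend, m_t
s <- 10 * sin(2 * pi * t / 12)            # seasonal component, s_t (period 12)
z <- rnorm(n, mean = 0, sd = 2)           # random component, z_t
x.ts <- ts(m + s + z, start = 2000, frequency = 12)

x.decomp <- decompose(x.ts)               # additive model is the default
plot(x.decomp)
```

The recovered trend should track \(0.5t\) and the seasonal panel the sinusoid, up to the noise.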
Various methods are used for estimating the trend; a common approach is to take a moving average of the values.
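As a sketch of the moving-average approach for monthly data: with an even period of 12, the usual choice is a centred 2x12 average, with weights 1/24 at the two ends and 1/12 in between (this is the trend estimate that decompose() itself uses):

```r
## Centred 2x12 moving average as a trend estimate for monthly data.
## The weights sum to one: 1/24 at each end, 1/12 for the middle terms.
ma.weights <- c(0.5, rep(1, 11), 0.5) / 12
AP.trend <- stats::filter(AirPassengers, filter = ma.weights, sides = 2)

plot(AirPassengers, ylab = "Air Passengers ('000s)")
lines(AP.trend, col = "red")              # smoothed trend overlaid on the data
```

Note that a centred 13-point filter cannot be computed for the first and last six months, so those values are NA.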
Using the decompose() function in R, look at the trend and the seasonal variation for the airline passenger data. The output of this function can be plotted directly, so visually check the output. Does the output match your intuition about what you observed?
Repeat this process for the CBE dataset.
Try a multiplicative model for all of the above. decompose() allows the selection of this via the type parameter. Is the multiplicative model better? In either case, explain why this might be.
Repeat the above, but use the stl() R function instead of decompose(). Compare the output of the two.
Using the decompose() function in R, look at the trend and the seasonal variation for the airline passenger data. The output of this function can be plotted directly, so visually check the output. Does the output match your intuition about what you observed?
AP.ts.decomp <- decompose(AP.ts);
str(AP.ts.decomp);
List of 6
$ x : Time-Series [1:144] from 1949 to 1961: 112 118 132 129 121 135 148 148 136 119 ...
$ seasonal: Time-Series [1:144] from 1949 to 1961: -24.75 -36.19 -2.24 -8.04 -4.51 ...
$ trend : Time-Series [1:144] from 1949 to 1961: NA NA NA NA NA ...
$ random : Time-Series [1:144] from 1949 to 1961: NA NA NA NA NA ...
$ figure : num [1:12] -24.75 -36.19 -2.24 -8.04 -4.51 ...
$ type : chr "additive"
- attr(*, "class")= chr "decomposed.ts"
plot(AP.ts.decomp);

Repeat this process for the CBE dataset.
beer.ts <- ts(CBE.df$beer, start = 1958, freq = 12);
choc.ts <- ts(CBE.df$choc, start = 1958, freq = 12);
elec.ts <- ts(CBE.df$elec, start = 1958, freq = 12);

beer.ts.decomp <- decompose(beer.ts);
choc.ts.decomp <- decompose(choc.ts);
elec.ts.decomp <- decompose(elec.ts);

cbe.ts <- cbind(beer.ts, choc.ts, elec.ts);
plot(cbe.ts);

plot(beer.ts.decomp);
plot(choc.ts.decomp);
plot(elec.ts.decomp);

Try a multiplicative model for all of the above. decompose() allows the selection of this via the type parameter. Is the multiplicative model better? In either case, explain why this might be.
AP.ts.mult.decomp <- decompose(AP.ts, type = 'multiplicative');
plot(AP.ts.mult.decomp);

Repeat the above, but use the stl() R function instead of decompose(). Compare the output of the two.
AP.ts.stl <- stl(AP.ts, s.window = 'periodic');
str(AP.ts.stl);
List of 8
$ time.series: Time-Series [1:144, 1:3] from 1949 to 1961: -25.5 -35.22 -3.03 -8.3 -5.74 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:3] "seasonal" "trend" "remainder"
$ weights : num [1:144] 1 1 1 1 1 1 1 1 1 1 ...
$ call : language stl(x = AP.ts, s.window = "periodic")
$ win : Named num [1:3] 1441 19 13
..- attr(*, "names")= chr [1:3] "s" "t" "l"
$ deg : Named int [1:3] 0 1 1
..- attr(*, "names")= chr [1:3] "s" "t" "l"
$ jump : Named num [1:3] 145 2 2
..- attr(*, "names")= chr [1:3] "s" "t" "l"
$ inner : int 2
$ outer : int 0
- attr(*, "class")= chr "stl"
plot(AP.ts.stl);

### To compare the two, I give an example of comparing the two calculated trends
plot.df <- melt(cbind(decomp = AP.ts.decomp$trend, stl = AP.ts.stl$time.series[, 'trend']));

head(plot.df);
  Var1  Var2 value
1 1 decomp NA
2 2 decomp NA
3 3 decomp NA
4 4 decomp NA
5 5 decomp NA
6 6 decomp NA
qplot(Var1, value, data = plot.df, colour = Var2, geom = 'line', ylim = c(0, 500));

Assuming we can remove the trend and the seasonal variation, that still leaves the random component, \(z_t\). Unfortunately, analysing this is usually highly non-trivial. As discussed, we model the random component as a sequence of random variables, but make no further assumptions.

To simplify the analysis, we often make assumptions such as treating the \(z_t\) as independent, identically distributed random variables, but this will rarely work well. Most of the time, the \(z_t\) are correlated.
The expectation or expected value of a random variable \(x\), denoted \(E(x)\), is the mean value of \(x\) in the population. So, for a continuous \(x\), we have

\[ \mu = E(x) = \int p(x) \, x \, dx, \]

and the variance, \(\sigma^2\), is the expectation of the squared deviations,

\[ \sigma^2 = E[(x - \mu)^2]. \]
For bivariate data, each datapoint can be represented as \((x, y)\), and we can generalise this concept to the covariance, \(\gamma(x, y)\),

\[\begin{equation} \gamma(x, y) = E[(x - \mu_x)(y - \mu_y)]. \end{equation}\]

Correlation, \(\rho\), is the standardised covariance, dividing the covariance by the standard deviations of the two variables,

\[\begin{equation} \rho(x, y) = \frac{\gamma(x, y)}{\sigma_x \sigma_y}. \end{equation}\]

The mean function of a time series model is

\[\begin{equation} \mu(t) = E(x_t), \end{equation}\]

with the expectation in this case being across the ensemble of possible time series that might have been produced by this model. Of course, in many cases, we only have one realisation of the model, and so, without any further assumptions, estimate the mean to be the measured value.

If the mean function is constant, we say that the time series is stationary in the mean, and the estimate of the population mean is just the sample mean,

\[\begin{equation} \mu = \frac{1}{n} \sum^n_{t=1} x_t. \end{equation}\]

The variance function of a time-series model that is stationary in the mean is given by

\[\begin{equation} \sigma^2(t) = E[(x_t - \mu)^2], \end{equation}\]

and if we make the further assumption that the time series is also stationary in the variance, then the population variance is just the sample variance,

\[\begin{equation} \text{Var}(x) = \frac{\sum(x_t - \mu)^2}{n - 1}. \end{equation}\]

Autocorrelation, often referred to as serial correlation, is the correlation between the random variables at different time intervals. We can define the autocovariance function (acvf) and the autocorrelation function (acf) as functions of the lag, \(k\), as

\[\begin{eqnarray} \gamma_k &=& E[(x_t - \mu)(x_{t+k} - \mu)], \\ \rho_k &=& \frac{\gamma_k}{\sigma^2}. \end{eqnarray}\]

By default, the acf() function plots the correlogram, which is a plot of the sample autocorrelation \(r_k\) against the lag \(k\).
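As a quick numerical sanity check of the standardised-covariance identity above (the simulated variables and coefficients here are purely illustrative):

```r
## Verify rho(x, y) = gamma(x, y) / (sigma_x * sigma_y) numerically,
## using R's cov(), sd() and cor() on simulated correlated data.
set.seed(1)
x <- rnorm(500)
y <- 0.7 * x + rnorm(500, sd = 0.5)

rho.direct <- cor(x, y)
rho.scaled <- cov(x, y) / (sd(x) * sd(y))

all.equal(rho.direct, rho.scaled)         # the two agree
```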
Using the function acf(), calculate the autocorrelations for all the time series we have looked at. Look at the structure of the output, and use the help system to see what options are provided.
Check the output of acf() against manual calculations of the correlations at various timesteps. Do the numbers match?
The cor() function and some vector indexing will be helpful here.
Plot the output of acf() for the different time series. Think about what these plots are telling you. Do these plots help the modelling process? If so, how?
Decompose the air passenger data and look at the appropriate correlogram. What does this plot tell you? How does it differ from the previous correlogram you looked at?
How can we use all that we have learned so far to assess the efficacy of the decompositional approach for time series?
AP.ts.acf <- acf(AP.ts, plot = FALSE);
str(AP.ts.acf);
List of 6
$ acf : num [1:22, 1, 1] 1 0.948 0.876 0.807 0.753 ...
$ type : chr "correlation"
$ n.used: int 144
$ lag : num [1:22, 1, 1] 0 0.0833 0.1667 0.25 0.3333 ...
$ series: chr "AP.ts"
$ snames: NULL
- attr(*, "class")= chr "acf"
AP.ts.acf <- acf(AP.ts, plot = FALSE);

calc.autocor <- function(x, lag) {
    N <- length(x);

    idx1 <- 1:(N - lag);
    idx2 <- (lag + 1):N;

    cor(x[idx1], x[idx2]);
}

### Adding 1 to the index of the acf as acf[1] is the lag at 0 (i.e. it has value 1)
lag <- 1;
AP.ts.acf$acf[lag + 1]
[1] 0.9480473
calc.autocor(AP.ts, lag);
[1] 0.9601946

lag <- 3;
AP.ts.acf$acf[lag + 1]
[1] 0.8066812
calc.autocor(AP.ts, lag);
[1] 0.8373948

lag <- 12;
AP.ts.acf$acf[lag + 1]
[1] 0.760395
calc.autocor(AP.ts, lag);
[1] 0.9905274

The numbers do not match exactly: acf() uses the overall sample mean and divides by \(n\) rather than computing a full correlation on the two overlapping segments, and the discrepancy grows with the lag.
AP.ts.acf <- acf(AP.ts, plot = FALSE);
plot(AP.ts.acf, ylim = c(-1, 1));

AP.ts.acf <- acf(AP.ts, plot = FALSE);
AP.ts.decomp <- decompose(AP.ts);
AP.ts.decomp.random <- AP.ts.decomp$random[!is.na(AP.ts.decomp$random)]
AP.ts.decomp.random.acf <- acf(AP.ts.decomp.random, plot = FALSE);

layout(1:2);
plot(AP.ts.acf, ylim = c(-1, 1));
plot(AP.ts.decomp.random.acf, ylim = c(-1, 1))

As mentioned earlier, an efficient way to forecast a variable is to find a related variable whose value leads it by one or more timesteps. The closer the relationship and the longer the lead time, the better the forecast becomes.
The trick, of course, is to find a leading variable.
Multivariate series have temporal equivalents to correlation and covariance, known as the cross-covariance function (ccvf) and the cross-correlation function (ccf),

\[\begin{eqnarray} \gamma_k(x, y) &=& E[(x_{t+k} - \mu_x)(y_t - \mu_y)], \\ \rho_k(x, y) &=& \frac{\gamma_k(x, y)}{\sigma_x \sigma_y}. \end{eqnarray}\]

Note that the above functions are not symmetric, as the lag is always on the first variable, \(x\).
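A small simulated illustration of this asymmetry (none of this uses the workshop data; the three-step lead is arbitrary): if x leads y by three timesteps, ccf(x, y) peaks at lag -3, i.e. x at time \(t - 3\) aligns with y at time \(t\).

```r
## Simulated leading-indicator example: y_t is (noisy) x_{t-3}, so x
## leads y by three steps and the cross-correlation peaks on one side only.
set.seed(7)
n <- 200
x.full <- rnorm(n + 3)
y <- x.full[1:n] + rnorm(n, sd = 0.3)     # y_t driven by an earlier x value
x <- x.full[4:(n + 3)]                    # align the series: x leads y by 3

xy.ccf <- ccf(x, y, plot = FALSE)
peak.lag <- xy.ccf$lag[which.max(xy.ccf$acf)]
peak.lag                                  # -3: x_{t-3} lines up with y_t
```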
Load the building approvals and activity data from the ApprovActiv.dat file. The data is quarterly and starts in 1996. Determine which is the leading variable and investigate the relationship between the two.
Using ts.union(), find the cross-correlations for the building data.
Is the relationship symmetric, and why?
Examine the cross-correlations of the random element of the decomposed time-series for the building data, and compare this to the original cross-correlations.
Our main objective in forecasting is to estimate the value of a future quantity, \(x_{n+k}\), given past values \({x_1, x_2, ..., x_n}\). We assume no seasonal or trend effects, or any such effects have been removed from the data. We assume that the underlying mean of the data is \(\mu_t\), and that this value changes from timestep to timestep, but this change is random.
Our model can be expressed as
\[\begin{equation} x_t = \mu_t + w_t, \end{equation}\]

where \(\mu_t\) is the non-stationary mean of the process at time \(t\) and \(w_t\) are independent random variates with mean \(0\) and standard deviation \(\sigma\). We let \(a_t\) be our estimate of \(\mu_t\), and can define the exponentially-weighted moving average (EWMA), \(a_t\), to be

\[\begin{equation} a_t = \alpha x_t + (1 - \alpha) a_{t-1}, \;\;\; 0 \leq \alpha \leq 1. \end{equation}\]

The value of \(\alpha\) controls the amount of smoothing, and is referred to as the smoothing parameter.
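The recursion above is easy to implement directly. A minimal sketch (the choice of \(a_1 = x_1\) as the starting value and \(\alpha = 0.2\) are illustrative; HoltWinters() below estimates \(\alpha\) and the initial level for you):

```r
## Direct implementation of the EWMA recursion
## a_t = alpha * x_t + (1 - alpha) * a_{t-1}, initialised at a_1 = x_1.
ewma <- function(x, alpha) {
    a <- numeric(length(x))
    a[1] <- x[1]
    for (t in 2:length(x)) {
        a[t] <- alpha * x[t] + (1 - alpha) * a[t - 1]
    }
    a
}

AP.vec <- as.vector(AirPassengers)
AP.smooth <- ewma(AP.vec, alpha = 0.2)

plot(AP.vec, type = 'l')
lines(AP.smooth, col = 'red')             # smaller alpha gives a smoother line
```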
Load the data in the motororg.dat file. This is a count of complaints received on a monthly basis by a motoring organisation from 1996 to 1999. Create an appropriate time series from this data. Plot the data, checking it for trends or seasonality.
Using the function HoltWinters(), with the additional parameters set to zero, create the EWMA of the data, allowing the function itself to choose the optimal value of \(\alpha\). Investigate and visualise the output, comparing it to the original time series.
Specifying values of \(\alpha\) of 0.2 and 0.9, create new versions of the EWMA and compare them with previous fits of the EWMA.
The Holt-Winters method generalises this concept, allowing for trends and seasonal effects. The equations that govern this model for seasonal period, \(p\), are given by
\[\begin{eqnarray} a_t &=& \alpha (x_t - s_{t-p}) + (1 - \alpha)(a_{t-1} - b_{t-1}), \nonumber \\ b_t &=& \beta (a_t - a_{t-1}) + (1 - \beta)b_{t-1},\\ s_t &=& \gamma (x_t - a_t) + (1 - \gamma) s_{t-p}, \nonumber \end{eqnarray}\]where \(a_t\), \(b_t\), \(s_t\) are the estimated level, slope and seasonal effect at time \(t\), and \(\alpha\), \(\beta\) and \(\gamma\) are the smoothing parameters.
Fit the Holt-Winters parameters to the air passenger data and check the fit. Visualise the raw time-series against the fitted data.
Predict data ahead for four years and visualise this data. How reliable are these forecasts do you think?
App.ts <- ts(AA.df$Approvals, start = c(1996,1), freq=4);
Act.ts <- ts(AA.df$Activity, start = c(1996,1), freq=4);
ts.plot(App.ts, Act.ts, lty = c(1, 3), ylim = c(0, 16000));

AppAct.ts <- ts.union(App.ts, Act.ts);
AppAct.ts.acf <- acf(AppAct.ts, plot = FALSE);
plot(AppAct.ts.acf);

App.ts.decomp <- decompose(App.ts);
App.rnd.ts <- window(App.ts.decomp$random, start = c(1996, 3));

Act.ts.decomp <- decompose(Act.ts);
Act.rnd.ts <- window(Act.ts.decomp$random, start = c(1996, 3));

AppAct.rnd.ts.acf <- acf(ts.union(App.rnd.ts, Act.rnd.ts), na.action = na.pass, plot = FALSE);
AppAct.rnd.ts.ccf <- ccf(App.rnd.ts, Act.rnd.ts, na.action = na.pass, plot = FALSE);

plot(AppAct.rnd.ts.acf);
plot(AppAct.rnd.ts.ccf);

Complaints.ts <- ts(motororg.df$complaints, start = c(1996, 1), freq = 12)
plot(Complaints.ts, xlab = "Time / months", ylab = "Complaints");

Complaints.ts.decomp <- decompose(Complaints.ts);
plot(Complaints.ts.decomp);

Complaints.hw1 <- HoltWinters(Complaints.ts, beta = FALSE, gamma = FALSE);
print(Complaints.hw1);

Holt-Winters exponential smoothing without trend and without seasonal component.
Call:
HoltWinters(x = Complaints.ts, beta = FALSE, gamma = FALSE)
Smoothing parameters:
alpha: 0.1429622
beta : FALSE
gamma: FALSE
Coefficients:
[,1]
a 17.70343
plot(Complaints.hw1);

Complaints.hw2 <- HoltWinters(Complaints.ts, alpha = 0.2, beta = FALSE, gamma = FALSE);
print(Complaints.hw2);

Holt-Winters exponential smoothing without trend and without seasonal component.
Call:
HoltWinters(x = Complaints.ts, alpha = 0.2, beta = FALSE, gamma = FALSE)
Smoothing parameters:
alpha: 0.2
beta : FALSE
gamma: FALSE
Coefficients:
[,1]
a 17.97913
plot(Complaints.hw2);

Complaints.hw3 <- HoltWinters(Complaints.ts, alpha = 0.9, beta = FALSE, gamma = FALSE);
print(Complaints.hw3);

Holt-Winters exponential smoothing without trend and without seasonal component.
Call:
HoltWinters(x = Complaints.ts, alpha = 0.9, beta = FALSE, gamma = FALSE)
Smoothing parameters:
alpha: 0.9
beta : FALSE
gamma: FALSE
Coefficients:
[,1]
a 22.32544
plot(Complaints.hw3);

layout(1:3);
plot(Complaints.hw1);
plot(Complaints.hw2);
plot(Complaints.hw3);

AP.ts.add.hw <- HoltWinters(AP.ts, seasonal = "add");
plot(AP.ts.add.hw);

AP.ts.mult.hw <- HoltWinters(AP.ts, seasonal = "mult");
plot(AP.ts.mult.hw);

AP.ts.mult.hw.predict <- predict(AP.ts.mult.hw, n.ahead = 4 * 12);
ts.plot(AP.ts, AP.ts.mult.hw.predict, lty = 1:2);

Now suppose we combine the ideas of autoregressive and moving average models. A time series follows an autoregressive moving average (ARMA) process of order \((p, q)\) when
\[\begin{equation} x_t = \sum_{i=1}^p \alpha_i x_{t-i} + w_t + \sum_{j=1}^q \beta_j w_{t-j} \end{equation}\]

where \(w_t\) is white noise.
Both \(\text{AR}(p)\) and \(\text{MA}(q)\) models are special cases of \(\text{ARMA}(p, q)\) (with \(q = 0\) and \(p = 0\) respectively), and ARMA models are usually preferred due to their parsimony: when fitting data, the ARMA model is usually more parameter efficient, requiring fewer parameters.
Using the R function arima.sim(), create an ARMA(1,1) time series of length 1000 with \(\alpha = -0.6\) and \(\beta = 0.5\). Plot this time series and check its ACF. Is it what you would expect?
Using arima(), fit your generated time series to an \(\text{ARMA}(1,1)\) model and compare the fitted output to the values you have set.
Repeat the above exercises, but for an \(\text{ARMA}(2, 2)\) model with \(\alpha_1 = 0.2\), \(\alpha_2 = -0.5\), \(\beta_1 = -0.1\) and \(\beta_2 = 0.3\).
Investigate the effect of the parameters \(p\) and \(q\) by generating the various combinations of the \(\text{ARMA}(p, q)\) models but using the same set of innovations in each case. arima.sim() has a parameter innov = ... that allows you to pass a set of innovations into the ARMA process.
Load in the GBP/NZD currency pair data from the file pound_nz.dat, and create a time-series from this. The data is quarterly, and starts in Q1 1991.
Fit the GBP/NZD data to an \(\text{MA}(1)\), an \(\text{AR}(1)\) and an \(\text{ARMA}(1, 1)\) process. Which of the above models does the best job at fitting the data?
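A sketch of the simulate-and-fit workflow for the first of these exercises (the seed is arbitrary; the `ar` and `ma` entries of the model list correspond to the \(\alpha_i\) and \(\beta_j\) in the ARMA equation above):

```r
## Simulate an ARMA(1,1) series with alpha = -0.6, beta = 0.5, then fit
## an ARMA(1,1) with arima() and compare the estimates to the truth.
set.seed(123)
x.arma <- arima.sim(n = 1000, model = list(ar = -0.6, ma = 0.5))

x.fit <- arima(x.arma, order = c(1, 0, 1))  # order = (p, d, q), with d = 0
coef(x.fit)                                 # ar1, ma1 should be near -0.6, 0.5
```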
It is probably becoming apparent that the choice of parameter count is non-trivial, and some kind of systematic approach would be desirable.
One such method for choosing the optimal number of parameters is to fit multiple options and choose the best one, using a metric known as the Akaike Information Criterion (AIC). The AIC is defined to be
\[\begin{equation} \text{AIC} = -2 \times \text{loglikelihood of fit} + 2 \times \text{parameter count}, \end{equation}\]

so that it balances improving the fit against the penalty of using too many parameters.
Use the R function AIC() to calculate the AIC of the three models above. Which one is the best? Why is this question a trap?
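To sketch the mechanics (on a simulated series rather than the GBP/NZD data, so the right answer is known by construction): fit the three candidate models and compare their AIC values, remembering that a lower AIC is preferred.

```r
## Fit MA(1), AR(1) and ARMA(1,1) to a series simulated from an
## ARMA(1,1) process, and compare the fits by AIC (lower is better).
set.seed(321)
x <- arima.sim(n = 500, model = list(ar = 0.7, ma = 0.3))

fits <- list(ma1    = arima(x, order = c(0, 0, 1)),
             ar1    = arima(x, order = c(1, 0, 0)),
             arma11 = arima(x, order = c(1, 0, 1)))

sapply(fits, AIC)                         # the ARMA(1,1) fit should win here
```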
Using all of the various techniques in this workshop, try to model the electricity time series data from the CBE dataset.
Let us look at the Nile data set, one of R's built-in data sets.
Nile

Time Series:
Start = 1871
End = 1970
Frequency = 1
[1] 1120 1160 963 1210 1160 1160 813 1230 1370 1140 995 935 1110 994 1020
[16] 960 1180 799 958 1140 1100 1210 1150 1250 1260 1220 1030 1100 774 840
[31] 874 694 940 833 701 916 692 1020 1050 969 831 726 456 824 702
[46] 1120 1100 832 764 821 768 845 864 862 698 845 744 796 1040 759
[61] 781 865 845 944 984 897 822 1010 771 676 649 846 812 742 801
[76] 1040 860 874 848 890 744 749 838 1050 918 986 797 923 975 815
[91] 1020 906 901 1170 912 746 919 718 714 740
class(Nile)
[1] "ts"

str(Nile)
 Time-Series [1:100] from 1871 to 1970: 1120 1160 963 1210 1160 1160 813 1230 1370 1140 ...

mode(Nile)
[1] "numeric"

head(Nile)
[1] 1120 1160  963 1210 1160 1160
library(TSA)
dev.new(width=8, height=4)
data(larain)
plot(larain,ylab='Inches',xlab='Year',type='o')

zlag(larain,d=1)
  [1]    NA 20.86 17.41 18.65  5.53 10.74 14.14 40.29 10.53 16.72 16.02 20.82
[13] 33.26 12.69 12.84 18.72 21.96 7.51 12.55 11.80 14.28 4.83 8.69 11.30
[25] 11.96 13.12 14.77 11.88 19.19 21.46 15.30 13.74 23.92 4.89 17.85 9.78
[37] 17.17 23.21 16.67 23.29 8.45 17.49 8.82 11.18 19.85 15.27 6.25 8.11
[49] 8.94 18.56 18.63 8.69 8.32 13.02 18.93 10.72 18.76 14.67 14.49 18.24
[61] 17.97 27.16 12.06 20.26 31.28 7.40 22.57 17.45 12.78 16.22 4.13 7.59
[73] 10.63 7.38 14.33 24.95 4.08 13.69 11.89 13.62 13.24 17.49 6.23 9.57
[85] 5.83 15.37 12.31 7.98 26.81 12.91 23.66 7.58 26.32 16.54 9.26 6.54
[97] 17.45 16.69 10.70 11.01 14.97 30.57 17.00 26.33 10.92 14.41 34.04 8.90
[109] 8.92 18.00 9.11 11.57 4.56 6.49 15.07
dev.new(width=7, height=7)
plot(y=larain,x=zlag(larain,d=1),ylab='Inches',
xlab='Previous Year Inches')
cor(y=larain,x=zlag(larain,d=1), use="pairwise")
[1] -0.03308892
dev.new(width=8, height=4)
data(color)
plot(color,ylab='Color Property',xlab='Batch',type='o')

zlag(color,d=1)
 [1] NA 67 63 76 66 69 71 72 71 72 72 83 87 76 79 74 81 76 77 68 68 74 68 69 75
[26] 80 81 86 86 79 78 77 77 80 76
dev.new(width=7, height=7)
plot(y=color,x=zlag(color,d=1),ylab='Color Property',
xlab='Previous Batch Color Property')
cor(y=color,x=zlag(color,d=1), use="pairwise")
[1] 0.554917
dev.new(width=8, height=4)
data(hare)
plot(hare,ylab='Abundance',xlab='Year',type='o')

zlag(hare,d=1)
 [1] NA 50 20 20 22 27 50 55 78 70 59 28 20 15 15 25 35 65 78 82 65 26 15 10  1
[26] 2 3 22 75 95 78
dev.new(width=7, height=7)
plot(y=hare,x=zlag(hare,d=1),ylab='Abundance',
xlab='Previous Year Abundance')
cor(y=hare,x=zlag(hare,d=1), use="pairwise")
[1] 0.7025777
dev.new(width=7, height=7)
plot(y=hare,x=zlag(hare,d=2),ylab='Abundance',
xlab='Abundance Two Year\'s Ago')
cor(y=hare,x=zlag(hare,d=2), use="pairwise")
dev.new(width=8, height=4)
data(tempdub)
plot(tempdub,ylab='Temperature',type='o')

dev.new(width=8, height=4)
plot(tempdub,ylab='Temperature',type='l')
points(y=tempdub,x=time(tempdub),
pch=as.vector(season(tempdub)), col=4, cex=0.8 )

aggregate(tempdub~season(tempdub), data=tempdub, mean)
   season(tempdub)  tempdub
1 January 16.60833
2 February 20.65000
3 March 32.47500
4 April 46.52500
5 May 58.09167
6 June 67.50000
7 July 71.71667
8 August 69.33333
9 September 61.02500
10 October 50.97500
11 November 36.65000
12 December 23.64167
dev.new(width=7, height=7)
boxplot(tempdub~season(tempdub), ylab="Temperature")

dev.new(width=7, height=7)
plot(y=tempdub,x=zlag(tempdub,d=1),ylab='Temperature',
xlab='Previous Month Temperature')
cor(y=tempdub,x=zlag(tempdub,d=1), use="pairwise")
[1] 0.8395499
dev.new(width=8, height=4)
data(oilfilters)
plot(oilfilters,ylab='Sales',type='o')

dev.new(width=8, height=4)
plot(oilfilters,ylab='Sales',type='l')
points(y=oilfilters,x=time(oilfilters),
pch=as.vector(season(oilfilters)), col=4, cex=0.8 )

aggregate(oilfilters~season(oilfilters), data=oilfilters, mean)
   season(oilfilters) oilfilters
1 January 5505.75
2 February 5361.00
3 March 2978.00
4 April 4591.50
5 May 3705.50
6 June 3404.25
7 July 3102.00
8 August 2676.75
9 September 2271.75
10 October 2470.25
11 November 2178.25
12 December 2409.00
dev.new(width=7, height=7)
boxplot(oilfilters~season(oilfilters), ylab="Sales")

dev.new(width=7, height=7)
plot(y=oilfilters,x=zlag(oilfilters,d=1),ylab='Sales',
xlab='Previous Month Sales')
cor(y=oilfilters,x=zlag(oilfilters,d=1), use="pairwise")
[1] 0.3142145